perf: Enhance Memory Management with Lock-Free Allocator, Preallocation, and Optimized Thread-Local Caching #2825

beats-dh · 2024-08-17T05:14:15Z

Detailed Description:

1. Introduction of Static Preallocation:

Implementation of the preallocate Method: Added the ability to preallocate a fixed number of memory blocks (STATIC_PREALLOCATION_SIZE = 100) when LockfreePoolingAllocator is initialized. This reduces the number of dynamic allocations needed at runtime.
Use of std::call_once: Preallocation is guarded by std::call_once and a std::once_flag, so it runs exactly once even if several threads initialize the allocator concurrently.
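
The one-time preallocation described above can be sketched as follows. This is a minimal illustration, not the PR's code: `PreallocatingPool`, `BLOCK_SIZE`, and `freeBlocks()` are hypothetical names, and a mutex-guarded `std::vector` stands in for the lock-free list.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical stand-in for the PR's allocator; only the one-time
// preallocation step is modelled here.
class PreallocatingPool {
public:
    static constexpr std::size_t STATIC_PREALLOCATION_SIZE = 100;
    static constexpr std::size_t BLOCK_SIZE = 64; // assumed block size

    PreallocatingPool() {
        // Every constructor call reaches this line, but the lambda runs once.
        std::call_once(preallocationFlag(), [] { preallocate(); });
    }

    // Number of blocks currently held in the free list.
    static std::size_t freeBlocks() {
        std::lock_guard<std::mutex> lock(freeListMutex());
        return freeList().size();
    }

private:
    static void preallocate() {
        std::lock_guard<std::mutex> lock(freeListMutex());
        freeList().reserve(STATIC_PREALLOCATION_SIZE);
        for (std::size_t i = 0; i < STATIC_PREALLOCATION_SIZE; ++i) {
            freeList().push_back(::operator new(BLOCK_SIZE));
        }
    }

    // Function-local statics avoid static-initialization-order problems.
    static std::once_flag& preallocationFlag() {
        static std::once_flag flag;
        return flag;
    }
    static std::mutex& freeListMutex() {
        static std::mutex mutex;
        return mutex;
    }
    static std::vector<void*>& freeList() {
        static std::vector<void*> list;
        return list;
    }
};
```

Constructing any number of `PreallocatingPool` objects leaves exactly `STATIC_PREALLOCATION_SIZE` blocks in the free list. The sketch never releases those blocks, which is acceptable for a pool that lives as long as the process.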

2. Optimization of Thread-Local Cache:

New Local Cache Limit (LOCAL_CACHE_LIMIT): The size of the thread-local cache is now calculated from the number of threads available in the system (TOTAL_THREADS), so that each thread keeps a useful cache without excessive memory usage in environments with many threads. The default value is std::max(35 / TOTAL_THREADS, 5): the limit shrinks as the thread count grows but never drops below 5.
Thread-Local Storage (thread_local): The local cache is stored in a thread_local variable, ensuring that each thread maintains its own separate cache, improving efficiency by avoiding contention.
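
To make the formula concrete, here is a small sketch of the limit calculation and a per-thread cache. It assumes integer division and that TOTAL_THREADS comes from something like `std::thread::hardware_concurrency()`; the helper names `localCacheLimit`, `detectTotalThreads`, and `localCache` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Mirrors std::max(35 / TOTAL_THREADS, 5) from the description,
// assuming integer division. totalThreads must be non-zero.
constexpr std::size_t localCacheLimit(std::size_t totalThreads) {
    return std::max<std::size_t>(35 / totalThreads, 5);
}

// hardware_concurrency() may return 0 when the value is unknown, so this
// assumed helper clamps it to 1 to keep the division well defined.
inline std::size_t detectTotalThreads() {
    return std::max<std::size_t>(std::thread::hardware_concurrency(), 1);
}

// Each thread that calls this function gets its own vector, so pushing
// and popping blocks here needs no synchronization.
inline std::vector<void*>& localCache() {
    thread_local std::vector<void*> cache;
    return cache;
}
```

Under these assumptions the limit is 35 for a single thread, 8 for four threads, and 5 for seven or more threads.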

3. Improvements in the `allocate` Function:

Simplification of Control Flow: The code has been simplified by removing empty blocks and better organizing the allocation checks. The function first tries the thread-local cache. If the local cache is empty, it attempts to retrieve a block from the lock-free shared list, and only if that also fails is a new memory block dynamically allocated.

4. Improvements in the `deallocate` Function:

Enhanced Memory Management: The deallocation function has been optimized to efficiently reuse memory blocks, prioritizing the local cache before returning blocks to the lock-free shared list.
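
The allocate and deallocate paths described in sections 3 and 4 can be sketched together. This is a simplified illustration under stated assumptions, not the PR's implementation: `TwoLevelPool` and its members are hypothetical, and the shared list is a mutex-guarded `std::vector` rather than a lock-free structure, so only the order of the checks is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical two-level pool. The mutex-guarded vector is a stand-in
// for the lock-free shared list and is NOT lock-free itself.
template <std::size_t BlockSize, std::size_t LocalCacheLimit>
class TwoLevelPool {
public:
    static void* allocate() {
        auto& cache = localCache();
        if (!cache.empty()) { // 1. thread-local cache
            void* block = cache.back();
            cache.pop_back();
            return block;
        }
        {
            std::lock_guard<std::mutex> lock(sharedMutex());
            auto& shared = sharedList();
            if (!shared.empty()) { // 2. shared list
                void* block = shared.back();
                shared.pop_back();
                return block;
            }
        }
        return ::operator new(BlockSize); // 3. fresh allocation
    }

    static void deallocate(void* block) {
        auto& cache = localCache();
        if (cache.size() < LocalCacheLimit) { // prefer the local cache
            cache.push_back(block);
            return;
        }
        std::lock_guard<std::mutex> lock(sharedMutex());
        sharedList().push_back(block); // overflow goes to the shared list
    }

    static std::size_t localCount() { return localCache().size(); }
    static std::size_t sharedCount() {
        std::lock_guard<std::mutex> lock(sharedMutex());
        return sharedList().size();
    }

private:
    static std::vector<void*>& localCache() {
        thread_local std::vector<void*> cache;
        return cache;
    }
    static std::vector<void*>& sharedList() {
        static std::vector<void*> list;
        return list;
    }
    static std::mutex& sharedMutex() {
        static std::mutex mutex;
        return mutex;
    }
};
```

With a local limit of 2, freeing three blocks leaves two in the calling thread's cache and one in the shared list, and later allocations drain the local cache before touching the shared list. The sketch does not return cached blocks to the shared list when a thread exits, which a production allocator would need to handle.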

Reason for Replacing `std::make_shared` with This New Implementation:

1. Precise Control of Allocation and Deallocation:

The primary reason for replacing std::make_shared with the new implementation using LockfreePoolingAllocator is the need for finer and more efficient control over memory allocation and deallocation. std::make_shared allocates the object and its control block (which holds the reference counts) in a single allocation, which is efficient in terms of memory usage, but it offers no way to choose where that memory comes from. It therefore lacks the flexibility required for a system that demands specific optimizations like thread-local caching and memory block preallocation.
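
For context, the standard library's way to keep the single combined allocation while routing it through a custom allocator is std::allocate_shared. The description does not state which call the PR uses, so the sketch below is only an assumption about the general technique; `CountingAllocator`, `Message`, and `makePooledMessage` are hypothetical, and the allocator merely counts calls where a pooling allocator would recycle blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical counters so the allocator's involvement is observable.
inline std::size_t& allocationCount() {
    static std::size_t count = 0;
    return count;
}
inline std::size_t& deallocationCount() {
    static std::size_t count = 0;
    return count;
}

// Minimal allocator. A pooling allocator would hand out and take back
// recycled blocks in allocate() and deallocate() instead.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocationCount();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        ++deallocationCount();
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Hypothetical payload type used only for this illustration.
struct Message {
    explicit Message(int len) : length(len) {}
    int length;
};

// Object and control block are obtained together through the allocator.
inline std::shared_ptr<Message> makePooledMessage(int length) {
    return std::allocate_shared<Message>(CountingAllocator<Message>{}, length);
}
```

The allocator is rebound internally to the combined object-plus-control-block type, so a real pool used this way has to size its blocks for that combined type rather than for `Message` alone.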

2. Optimization for Multithreaded Environments:

In a multithreaded environment, the new implementation allows each thread to maintain its own local memory block cache, reducing contention when accessing shared resources. std::make_shared, on the other hand, does not provide mechanisms to leverage these optimizations, making it less efficient in scenarios where frequent object allocation and deallocation occur.

3. Reusing Memory Blocks:

With the new implementation, memory blocks can be reused from both a thread-local cache and a lock-free shared list, depending on availability. This not only improves performance by minimizing dynamic allocations, but also offers more predictable and efficient memory management, especially under high load.

4. Reduction of Memory Fragmentation:

The new allocation strategy should reduce memory fragmentation, because blocks are preallocated and reused instead of being repeatedly requested from and returned to the general-purpose heap. std::make_shared does not allow this kind of granular control.

5. Flexibility and Extensibility:

The new approach also offers greater flexibility for future optimizations and adjustments according to the application's needs. The LockfreePoolingAllocator implementation allows customizations such as the amount of preallocated memory, the size of the local cache, and the behavior of the lock-free list, aspects that cannot be easily managed with std::make_shared.

In summary, switching to this new implementation provides greater control and efficiency in memory management, improving application performance in environments where scalability and high performance are crucial.

github-actions · 2024-10-19T02:32:48Z

This PR is stale because it has been open 45 days with no activity.

sonarqubecloud · 2024-10-30T04:58:20Z

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

github-actions · 2024-11-30T02:39:23Z

This PR is stale because it has been open 45 days with no activity.

sonarqubecloud · 2024-12-06T02:56:02Z

Quality Gate passed

Issues
26 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

jhogberg mentioned this pull request Sep 12, 2024

feat: livestream (cast) system #2653

Open

3 tasks

dudantas reviewed Sep 18, 2024

src/server/network/message/outputmessage.cpp (outdated)

github-actions bot added Stale No activity and removed Stale No activity labels Oct 19, 2024

beats-dh force-pushed the lockerfree branch from 93b526d to a4637a9 Compare October 26, 2024 15:16

beats-dh and others added 4 commits October 29, 2024 19:38

init

ca96030

fix: condition if-statement

ecc3d80

up

7fe9977

test

a51681a

beats-dh force-pushed the lockerfree branch from a4637a9 to a51681a Compare October 29, 2024 23:39

fix

bb8fff2

beats-dh force-pushed the lockerfree branch from 9120cdf to bb8fff2 Compare October 30, 2024 04:53

github-actions bot added the Stale No activity label Nov 30, 2024

beats-dh and others added 3 commits December 5, 2024 21:25

Merge branch 'main' into lockerfree

7073134

update

f1e02d3

Code format - (Clang-format)

e61fcf6

github-actions bot removed the Stale No activity label Dec 7, 2024
